Making Backyards Affordable for All: A YIMBY Analysis
Author
Your Name
Published
October 19, 2025
Introduction
Housing affordability remains one of the most pressing challenges facing American cities today. This analysis examines metropolitan areas across the United States to identify “YIMBY” (Yes In My Backyard) success stories—cities that have effectively addressed housing affordability through permissive building policies. Using data from the US Census Bureau and Bureau of Labor Statistics, we develop metrics to measure rent burden and housing growth, ultimately identifying which cities have made meaningful progress in making housing more affordable.
Data Acquisition
Task 1: Data Import
# Load required packages
library(tidyverse)
library(DT)
library(scales)
library(tidycensus)
library(glue)
library(readxl)
library(httr2)
library(rvest)
library(gghighlight)

# Set up data directory
if(!dir.exists(file.path("data", "mp02"))){
  dir.create(file.path("data", "mp02"), showWarnings=FALSE, recursive=TRUE)
}

# Helper function for package installation
ensure_package <- function(pkg){
  pkg <- as.character(substitute(pkg))
  options(repos = c(CRAN = "https://cloud.r-project.org"))
  if(!require(pkg, character.only=TRUE, quietly=TRUE)) install.packages(pkg)
  stopifnot(require(pkg, character.only=TRUE, quietly=TRUE))
}

# Census API key setup (first time only)
# tidycensus::census_api_key("your_key_here", install = TRUE)

# Helper function to get ACS data across multiple years
get_acs_all_years <- function(variable, geography="cbsa",
                              start_year=2009, end_year=2023){
  fname <- glue("{variable}_{geography}_{start_year}_{end_year}.csv")
  fname <- file.path("data", "mp02", fname)

  if(!file.exists(fname)){
    YEARS <- seq(start_year, end_year)
    YEARS <- YEARS[YEARS != 2020] # Drop 2020 - No survey (covid)

    ALL_DATA <- map(YEARS, function(yy){
      tidycensus::get_acs(geography, variable, year=yy, survey="acs1") |>
        mutate(year=yy) |>
        select(-moe, -variable) |>
        rename(!!variable := estimate)
    }) |> bind_rows()

    write_csv(ALL_DATA, fname)
  }

  read_csv(fname, show_col_types=FALSE)
}

# Load Census ACS data
INCOME <- get_acs_all_years("B19013_001") |>
  rename(household_income = B19013_001)

RENT <- get_acs_all_years("B25064_001") |>
  rename(monthly_rent = B25064_001)

POPULATION <- get_acs_all_years("B01003_001") |>
  rename(population = B01003_001)

HOUSEHOLDS <- get_acs_all_years("B11001_001") |>
  rename(households = B11001_001)
# Helper function to get building permits data
get_building_permits <- function(start_year = 2009, end_year = 2023){
  fname <- glue("housing_units_{start_year}_{end_year}.csv")
  fname <- file.path("data", "mp02", fname)

  if(!file.exists(fname)){
    # Historical data (2009-2018) from text files
    HISTORICAL_YEARS <- seq(start_year, 2018)

    HISTORICAL_DATA <- map(HISTORICAL_YEARS, function(yy){
      historical_url <- glue("https://www.census.gov/construction/bps/txt/tb3u{yy}.txt")

      LINES <- readLines(historical_url)[-c(1:11)]

      CBSA_LINES <- str_detect(LINES, "^[[:digit:]]")
      CBSA <- as.integer(str_sub(LINES[CBSA_LINES], 5, 10))

      PERMIT_LINES <- str_detect(str_sub(LINES, 48, 53), "[[:digit:]]")
      PERMITS <- as.integer(str_sub(LINES[PERMIT_LINES], 48, 53))

      data.frame(CBSA = CBSA,
                 new_housing_units_permitted = PERMITS,
                 year = yy)
    }) |> bind_rows()

    # Current data (2019-2023) from Excel files
    CURRENT_YEARS <- seq(2019, end_year)

    CURRENT_DATA <- map(CURRENT_YEARS, function(yy){
      current_url <- glue("https://www.census.gov/construction/bps/xls/msaannual_{yy}99.xls")

      temp <- tempfile()
      download.file(current_url, destfile = temp)

      # Fallback function for different Excel formats
      fallback <- function(.f1, .f2){
        function(...){
          tryCatch(.f1(...), error=function(e) .f2(...))
        }
      }

      reader <- fallback(readxl::read_xlsx, readxl::read_xls)

      reader(temp, skip=5) |>
        na.omit() |>
        select(CBSA, Total) |>
        mutate(year = yy) |>
        rename(new_housing_units_permitted = Total)
    }) |> bind_rows()

    ALL_DATA <- rbind(HISTORICAL_DATA, CURRENT_DATA)

    write_csv(ALL_DATA, fname)
  }

  read_csv(fname, show_col_types=FALSE)
}

PERMITS <- get_building_permits()
Answer: The Houston, TX metro area permitted the largest number of new housing units between 2010 and 2019, with a total of 482,075 units. Its official name changed over the period (Houston-Sugar Land-Baytown, then Houston-The Woodlands-Sugar Land, then Houston-Pasadena-The Woodlands), but all three names refer to the same CBSA.
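Because a CBSA's official name can change between ACS vintages, joining permit totals to every name on record repeats the same area once per name. The following is a minimal sketch of one way to avoid that, by keeping only the most recent name per CBSA code before joining. It uses base R and small made-up tables (`permits` and `cbsa_names`, with hypothetical counts), not the report's actual data or code.

```r
# Toy permit counts: hypothetical values, for illustration only
permits <- data.frame(
  CBSA = c(26420, 26420, 26420, 19100, 19100),
  year = c(2010, 2015, 2019, 2010, 2019),
  new_housing_units_permitted = c(100, 200, 300, 150, 250)
)

# Toy name table: one CBSA code appears under three different names
cbsa_names <- data.frame(
  GEOID = c(26420, 26420, 26420, 19100),
  NAME = c("Houston-Sugar Land-Baytown, TX Metro Area",
           "Houston-The Woodlands-Sugar Land, TX Metro Area",
           "Houston-Pasadena-The Woodlands, TX Metro Area",
           "Dallas-Fort Worth-Arlington, TX Metro Area"),
  year = c(2010, 2015, 2023, 2019)
)

# Keep a single name per CBSA code: the most recent one
latest_names <- cbsa_names[order(cbsa_names$GEOID, -cbsa_names$year), ]
latest_names <- latest_names[!duplicated(latest_names$GEOID), c("GEOID", "NAME")]

# Sum permits per CBSA over the 2010-2019 window
in_window <- permits[permits$year >= 2010 & permits$year <= 2019, ]
totals <- aggregate(new_housing_units_permitted ~ CBSA, data = in_window, FUN = sum)

# Join totals to the de-duplicated names and pick the largest total
top <- merge(totals, latest_names, by.x = "CBSA", by.y = "GEOID")
top <- top[which.max(top$new_housing_units_permitted), ]
print(top)
```

With one name per code, the answer sentence receives exactly one area name and one total.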
Question 2: Albuquerque Peak Permitting Year
In what year did Albuquerque permit the most new housing units?
Answer: Albuquerque permitted the most new housing units in 2021, with 4,021 permits issued.
Note: The 2021 figure falls within the COVID-19 period, when construction activity and permit reporting were disrupted, so this peak should be interpreted with caution.
Question 3: Highest Average Income State (2015)
Which state had the highest average household income in 2015?
state_df <- data.frame(
  abb = c(state.abb, "DC", "PR"),
  name = c(state.name, "District of Columbia", "Puerto Rico")
)

q3_result <- INCOME |>
  filter(year == 2015) |>
  left_join(HOUSEHOLDS |> filter(year == 2015), by = c("GEOID", "NAME", "year")) |>
  mutate(
    state = str_extract(NAME, ", (.{2})", group = 1),
    total_income = household_income * households
  ) |>
  group_by(state) |>
  summarize(
    total_income = sum(total_income, na.rm = TRUE),
    total_households = sum(households, na.rm = TRUE),
    avg_income = total_income / total_households,
    .groups = "drop"
  ) |>
  arrange(desc(avg_income)) |>
  left_join(state_df, by = c("state" = "abb"))

q3_result |>
  slice(1:10) |>
  select(name, avg_income, total_households) |>
  mutate(
    avg_income = dollar(avg_income, accuracy = 1),
    total_households = comma(total_households, accuracy = 1)
  ) |>
  datatable(
    caption = "Top 10 States by Average Household Income (2015)",
    options = list(pageLength = 10, dom = 't'),
    colnames = c("State", "Avg Income", "Total Households"),
    rownames = FALSE
  )
Answer: District of Columbia had the highest average household income in 2015 at $93,294.
Question 4: NYC Data Scientists Peak
In what year did New York City last have the most data scientists?
# Data scientists are NAICS code 5182
data_scientists <- WAGES |>
  filter(INDUSTRY == 5182) |>
  mutate(std_cbsa = paste0(FIPS, "0"))

q4_result <- data_scientists |>
  group_by(YEAR) |>
  slice_max(EMPLOYMENT, n = 1) |>
  ungroup() |>
  left_join(
    POPULATION |>
      mutate(std_cbsa = paste0("C", GEOID)) |>
      select(std_cbsa, NAME) |>
      distinct(),
    by = "std_cbsa"
  ) |>
  mutate(EMPLOYMENT = comma(EMPLOYMENT, accuracy = 1))

q4_result |>
  select(YEAR, NAME, EMPLOYMENT) |>
  datatable(
    caption = "CBSA with Most Data Scientists by Year",
    options = list(pageLength = 15),
    colnames = c("Year", "Metropolitan Area", "Employment"),
    rownames = FALSE
  )
Answer: NYC last had the most data scientists in 2015.
Question 5: NYC Finance Wages
What was the peak year for finance sector wages in NYC?
# Finance and insurance is NAICS code 52
# The other chunks build std_cbsa as paste0(FIPS, "0"), which implies that the
# BLS FIPS value for NYC (CBSA 35620) is "C3562". Matching on the prefix "C3562"
# works for either form; matching on "C35620" would return no rows in that case.
nyc_finance <- WAGES |>
  filter(str_starts(FIPS, "C3562")) |>
  mutate(is_finance = INDUSTRY == 52) |>
  group_by(YEAR) |>
  summarize(
    finance_wages = sum(TOTAL_WAGES[is_finance], na.rm = TRUE),
    total_wages = sum(TOTAL_WAGES, na.rm = TRUE),
    finance_fraction = finance_wages / total_wages,
    .groups = "drop"
  )

peak_year <- nyc_finance |>
  slice_max(finance_fraction, n = 1)

nyc_finance |>
  mutate(finance_fraction = percent(finance_fraction, accuracy = 0.1)) |>
  datatable(
    caption = "Finance Sector Wages as % of Total in NYC",
    options = list(pageLength = 15),
    colnames = c("Year", "Finance Wages", "Total Wages", "Finance %"),
    rownames = FALSE
  )
Answer: The finance and insurance sector peaked at of total NYC wages in .
Task 3: Initial Visualizations
Visualization 1: Rent vs. Household Income (2009)
rent_income_2009 <- RENT |>
  filter(year == 2009) |>
  inner_join(INCOME |> filter(year == 2009), by = c("GEOID", "NAME", "year"))

# ACS variables B19013_001 and B25064_001 are medians (median household income
# and median gross rent), so the axes are labeled accordingly.
ggplot(rent_income_2009, aes(x = household_income, y = monthly_rent)) +
  geom_point(alpha = 0.5, color = "#2C3E50", size = 2.5) +
  geom_smooth(method = "lm", color = "#3498DB", se = TRUE, linewidth = 1.2) +
  scale_x_continuous(labels = dollar_format(), limits = c(0, NA)) +
  scale_y_continuous(labels = dollar_format(), limits = c(0, NA)) +
  labs(
    title = "Relationship Between Household Income and Monthly Rent",
    subtitle = "Core-Based Statistical Areas (CBSAs) in 2009",
    x = "Median Household Income (Annual)",
    y = "Median Monthly Rent",
    caption = "Source: US Census Bureau, American Community Survey"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 15, color = "#2C3E50"),
    plot.subtitle = element_text(color = "#7F8C8D", margin = margin(b = 15)),
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "#ECF0F1"),
    axis.title = element_text(color = "#34495E")
  )
Interpretation: There is a strong positive relationship between household income and monthly rent across CBSAs. Higher-income areas tend to have higher rents, suggesting that housing costs scale with local economic conditions.
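The strength of this relationship can be quantified rather than read off the plot. The sketch below shows one way to do that with a Pearson correlation and a simple linear fit. It runs on a handful of made-up income and rent values (the vectors `income` and `rent` are hypothetical), so the numbers it prints say nothing about the real 2009 data. With the report's data, the same calls would take the `household_income` and `monthly_rent` columns of `rent_income_2009`.

```r
# Hypothetical CBSA-level values, for illustration only
income <- c(40000, 50000, 60000, 75000, 90000)  # annual household income ($)
rent   <- c(650, 780, 900, 1100, 1350)          # monthly rent ($)

# Pearson correlation: values near 1 indicate a strong positive linear relationship
r <- cor(income, rent)

# Simple linear regression of rent on income
fit <- lm(rent ~ income)
slope <- coef(fit)[["income"]]

# Slope expressed as extra monthly rent per additional $1,000 of annual income
rent_per_1000_income <- 1000 * slope

cat("correlation:", round(r, 3), "\n")
cat("extra monthly rent per $1,000 of income:", round(rent_per_1000_income, 2), "\n")
```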
Visualization 2: Total vs. Healthcare Employment Over Time
healthcare_employment <- WAGES |>
  mutate(
    is_healthcare = INDUSTRY == 62,
    std_cbsa = paste0(FIPS, "0")
  ) |>
  group_by(std_cbsa, YEAR) |>
  summarize(
    healthcare_employment = sum(EMPLOYMENT[is_healthcare], na.rm = TRUE),
    total_employment = sum(EMPLOYMENT, na.rm = TRUE),
    .groups = "drop"
  ) |>
  filter(total_employment > 0, healthcare_employment > 0)

ggplot(healthcare_employment,
       aes(x = total_employment, y = healthcare_employment, color = YEAR)) +
  geom_point(alpha = 0.5, size = 2) +
  scale_color_viridis_c(option = "viridis") +
  scale_x_log10(labels = comma_format()) +
  scale_y_log10(labels = comma_format()) +
  labs(
    title = "Healthcare Employment vs. Total Employment Across CBSAs",
    subtitle = "Evolution from 2009-2023 (log-log scale)",
    x = "Total Employment (log scale)",
    y = "Healthcare & Social Services Employment (log scale)",
    color = "Year",
    caption = "Source: Bureau of Labor Statistics, Quarterly Census of Employment and Wages"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 15, color = "#2C3E50"),
    plot.subtitle = element_text(color = "#7F8C8D", margin = margin(b = 15)),
    legend.position = "right",
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "#ECF0F1")
  )
Interpretation: Healthcare employment grows proportionally with total employment across CBSAs. The color gradient shows the temporal evolution, with more recent years showing slightly higher healthcare employment shares, reflecting the sector’s continued growth.
Visualization 3: Household Size Evolution Over Time
household_size <- POPULATION |>
  inner_join(HOUSEHOLDS, by = c("GEOID", "NAME", "year")) |>
  mutate(household_size = population / households) |>
  filter(!is.na(household_size), household_size > 0)

top_cbsas <- POPULATION |>
  filter(year == 2019) |>
  slice_max(population, n = 20) |>
  pull(GEOID)

household_size_subset <- household_size |>
  filter(GEOID %in% top_cbsas)

ggplot(household_size_subset,
       aes(x = year, y = household_size, group = NAME, color = NAME)) +
  geom_line(linewidth = 1) +
  gghighlight(
    NAME %in% c("New York-Newark-Jersey City, NY-NJ-PA Metro Area",
                "Los Angeles-Long Beach-Anaheim, CA Metro Area"),
    use_direct_label = TRUE,
    label_params = list(size = 3.5, nudge_y = 0.05, segment.color = "gray50")
  ) +
  scale_y_continuous(limits = c(2, 3.5)) +
  scale_color_manual(
    values = c(
      "New York-Newark-Jersey City, NY-NJ-PA Metro Area" = "#3498DB",
      "Los Angeles-Long Beach-Anaheim, CA Metro Area" = "#E74C3C"
    )
  ) +
  labs(
    title = "Evolution of Average Household Size Over Time",
    subtitle = "Top 20 largest US metropolitan areas (2009-2023), highlighting NYC and LA",
    x = "Year",
    y = "Average Household Size (persons per household)",
    caption = "Source: US Census Bureau, American Community Survey\nNote: 2020 data unavailable due to COVID-19"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 15, color = "#2C3E50"),
    plot.subtitle = element_text(color = "#7F8C8D", margin = margin(b = 15)),
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "#ECF0F1"),
    legend.position = "none"
  )
Interpretation: Household sizes have remained relatively stable over time, hovering around 2.5-3.0 persons per household, with modest variation across metropolitan areas. This stability suggests that demographic trends in household formation have been consistent despite economic changes.
viz_data <- yimby_data |>
  filter(!is.na(recent_growth_index), !is.na(recent_rent_burden))

viz_data <- viz_data |>
  mutate(
    is_yimby_success = GEOID %in% yimby_successes$GEOID,
    category = case_when(
      is_yimby_success ~ "YIMBY Success",
      recent_rent_burden > median_early_burden &
        recent_growth_index > median_growth_index ~ "High Burden, High Growth",
      recent_rent_burden > median_early_burden ~ "High Burden, Low Growth",
      recent_growth_index > median_growth_index ~ "Low Burden, High Growth",
      TRUE ~ "Other"
    )
  )

ggplot(viz_data,
       aes(x = recent_growth_index, y = recent_rent_burden,
           color = category, size = total_pop_growth)) +
  geom_point(alpha = 0.6) +
  scale_color_manual(values = c(
    "YIMBY Success" = "#27AE60",
    "High Burden, High Growth" = "#3498DB",
    "High Burden, Low Growth" = "#E74C3C",
    "Low Burden, High Growth" = "#9B59B6",
    "Other" = "#BDC3C7"
  )) +
  scale_size_continuous(labels = percent_format(), range = c(1, 8)) +
  labs(
    title = "Housing Growth vs. Rent Burden Across US Metro Areas",
    subtitle = "YIMBY successes show high growth with decreasing burden",
    x = "Composite Housing Growth Index (2019-2023 avg)",
    y = "Recent Rent Burden Index (2021-2023 avg, 100 = 2009 baseline)",
    color = "Metro Category",
    size = "Total Pop. Growth\n(2009-2023)",
    caption = "Source: US Census Bureau ACS and Building Permits Survey"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 15, color = "#2C3E50"),
    plot.subtitle = element_text(color = "#7F8C8D", margin = margin(b = 15)),
    legend.position = "right",
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "#ECF0F1")
  )
Visualization 2: Rent Burden Change Over Time
# Get raw YIMBY successes data (before formatting)
yimby_successes_raw <- yimby_data |>
  filter(
    early_rent_burden > median_early_burden,
    rent_burden_change < 0,
    total_pop_growth > 0,
    recent_growth_index > median_growth_index
  ) |>
  arrange(desc(recent_growth_index))

top_yimby <- yimby_successes_raw |>
  slice(1:5) |>
  pull(GEOID)

high_burden_low_growth <- yimby_data |>
  filter(
    recent_rent_burden > median_early_burden,
    recent_growth_index < median_growth_index
  ) |>
  slice_max(recent_rent_burden, n = 3) |>
  pull(GEOID)

comparison_cities <- c(top_yimby, high_burden_low_growth)

rent_burden_comparison <- rent_burden_data |>
  filter(GEOID %in% comparison_cities) |>
  mutate(
    city_type = if_else(GEOID %in% top_yimby, "YIMBY Success", "High Burden City"),
    city_label = str_extract(NAME, "^[^,]+")
  )

ggplot(rent_burden_comparison,
       aes(x = year, y = rent_burden_index, color = city_label, linetype = city_type)) +
  geom_line(linewidth = 1.1) +
  geom_hline(yintercept = 100, linetype = "dotted", color = "#7F8C8D", linewidth = 0.8) +
  scale_linetype_manual(values = c("YIMBY Success" = "solid", "High Burden City" = "dashed")) +
  labs(
    title = "Evolution of Rent Burden: YIMBY Successes vs. High-Burden Cities",
    subtitle = "Index = 100 represents 2009 national average",
    x = "Year",
    y = "Rent Burden Index",
    color = "Metropolitan Area",
    linetype = "City Type",
    caption = "Source: US Census Bureau, American Community Survey"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 15, color = "#2C3E50"),
    plot.subtitle = element_text(color = "#7F8C8D", margin = margin(b = 15)),
    legend.position = "right",
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "#ECF0F1")
  )
Key Findings:
Our analysis identifies several metropolitan areas as YIMBY success stories. These cities started with relatively high rent burdens but managed to increase their housing supply while attracting population growth, leading to improved affordability over time. In contrast, many high-burden cities failed to build sufficient housing, resulting in persistent or worsening affordability challenges.
TO: Congressional Representatives from , and Los Angeles-Long Beach-Anaheim, CA
FROM: National YIMBY Coalition
RE: Federal YIMBY Incentive Program - Proposed Legislation
DATE: October 19, 2025
Executive Summary
We propose federal legislation to incentivize local municipalities to adopt YIMBY (Yes In My Backyard) housing policies. This program would provide grants to cities that increase housing development relative to population growth, with the goal of improving housing affordability nationwide.
Why Your Districts Need This Bill
**** (Primary Sponsor): Your city is a YIMBY success story. With a housing growth index of (well above the national median), has demonstrated that permissive zoning works. Despite starting with a rent burden index of , your city has reduced this to through consistent housing development. This bill would provide federal recognition and additional resources to continue this momentum.
Los Angeles-Long Beach-Anaheim (Co-Sponsor): Your constituents face a rent burden index of 134 (significantly above the 2009 baseline of 100). With a housing growth index of only 32, your city has struggled to build sufficient housing to meet demand. This federal program would provide both incentive funding and technical assistance to jumpstart housing development and make your city more affordable for working families.
Building Labor and Industry Support
Multiple employment sectors in both districts would directly benefit from improved housing affordability. Lower rent burdens mean more disposable income for consumers, benefiting retail and service industries. For employers, improved housing affordability helps attract and retain talent while reducing pressure for wage increases.
Proposed Metrics for Federal Funding
The program would use two key metrics to identify qualifying cities and allocate funding:
Rent Burden Index: Measures the ratio of median rent to median household income, indexed to 2009 national levels (baseline = 100). This metric identifies which cities face the greatest affordability challenges and tracks improvement over time.
Housing Growth Index: A composite measure combining:
Instantaneous growth: New housing units permitted per 1,000 residents
Rate-based growth: New housing units relative to population growth
Cities showing improvement on these metrics qualify for tiered federal grants, with the largest rewards going to communities that successfully reduce rent burden while accommodating population growth.
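As a rough illustration of the Rent Burden Index described above, the sketch below computes a rent-to-income ratio and rescales it so that the 2009 level equals 100. Two points are assumptions rather than the report's documented method: rent is annualized by multiplying by 12, and the "2009 national level" is taken as the unweighted mean of the 2009 CBSA ratios. The data frame `acs` holds made-up values for two hypothetical metros.

```r
# Hypothetical metro-level data, for illustration only
acs <- data.frame(
  GEOID = c("A", "A", "B", "B"),
  year = c(2009, 2023, 2009, 2023),
  monthly_rent = c(1000, 1500, 800, 900),
  household_income = c(60000, 75000, 48000, 60000)
)

# Share of annual income spent on rent (assumption: rent annualized as 12 x monthly)
acs$rent_to_income <- 12 * acs$monthly_rent / acs$household_income

# Baseline: unweighted mean of the 2009 ratios (assumption; a population-weighted
# mean would be an equally plausible reading of "2009 national level")
baseline <- mean(acs$rent_to_income[acs$year == 2009])

# Index so that the 2009 baseline equals 100
acs$rent_burden_index <- 100 * acs$rent_to_income / baseline

print(acs[, c("GEOID", "year", "rent_burden_index")])
```

Under this scaling, a value above 100 means a metro's renters spend a larger share of income on rent than the 2009 baseline, and a value below 100 means a smaller share.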
Call to Action
We request your co-sponsorship of this legislation and ask that you leverage your relationships with local business and labor groups to build coalition support. Our analysis shows this program would benefit millions of Americans while rewarding cities that embrace housing abundance.
Conclusion
This analysis has identified metropolitan areas across the United States that have successfully implemented YIMBY-friendly housing policies, resulting in improved affordability despite population growth. By contrast, many high-cost cities have failed to build sufficient housing, exacerbating affordability crises.
Key Takeaways:
YIMBY policies work: Cities with high housing growth rates have successfully reduced rent burdens even while experiencing population growth
The problem is measurable: Our rent burden and housing growth indices provide clear, data-driven metrics for policy evaluation
Federal incentives can help: A targeted grant program could encourage more cities to adopt pro-housing policies
The proposed federal YIMBY incentive program would reward cities that increase housing supply, providing both political cover for local leaders and financial resources to continue pro-housing policies. With support from key employment sectors and bipartisan appeal, this legislation represents a pragmatic approach to one of America’s most pressing challenges.
Appendix: Technical Notes
Data Sources
American Community Survey (ACS): Annual household income, rent, population, and household data from 2009-2023 (excluding 2020)
Census Building Permits Survey: New housing units permitted annually by CBSA
BLS Quarterly Census of Employment and Wages (QCEW): Employment and wage data by industry and geography
Instantaneous: Percentile rank of permits per 1,000 residents
Rate-based: Percentile rank of permits per new resident (5-year lookback)
Composite: Average of instantaneous and rate-based indices
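The three components above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the report's actual code: percentile ranks are scaled from 0 to 100, the rate-based measure divides a single year's permits by the population change over the previous five years, and the helper `pct_rank` and the data frame `metros` (with made-up values) are hypothetical.

```r
# Percentile rank scaled to 0-100; equivalent to 100 * dplyr::percent_rank()
# for vectors without missing values
pct_rank <- function(x) {
  100 * (rank(x, ties.method = "min") - 1) / (length(x) - 1)
}

# Hypothetical metro-level data, for illustration only
metros <- data.frame(
  GEOID = c("A", "B", "C", "D"),
  permits = c(500, 2000, 1200, 300),
  population = c(100000, 250000, 400000, 150000),
  population_5y_ago = c(98000, 230000, 395000, 149000)
)

# Instantaneous measure: permits per 1,000 current residents
metros$permits_per_1000 <- 1000 * metros$permits / metros$population

# Rate-based measure: permits per new resident over the 5-year lookback
# (only meaningful where population grew; shrinking metros need separate handling)
metros$permits_per_new_resident <-
  metros$permits / (metros$population - metros$population_5y_ago)

# Convert each measure to a percentile rank, then average them
metros$instantaneous_index <- pct_rank(metros$permits_per_1000)
metros$rate_index <- pct_rank(metros$permits_per_new_resident)
metros$growth_index <- (metros$instantaneous_index + metros$rate_index) / 2

print(metros[, c("GEOID", "instantaneous_index", "rate_index", "growth_index")])
```

Averaging percentile ranks keeps the two measures on a common scale, so neither dominates the composite simply because of its units.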
Reproducibility
All code and data sources are documented in this report. Data is automatically downloaded and cached locally. To reproduce this analysis, run this Quarto document with R and the required packages installed.